Prosper Loan Data Visualization



Bright Alorwoyie


Source: Financial Express



Introduction



Questions for consideration


  1. What factors influence the outcome status of a loan?
  2. What factors influence the borrower's APR or interest rate?
  3. Are there differences in loans based on the initial loan amount?
  4. How do credit score ratings affect the interest charged on loans?
  5. Does the term/duration of a loan influence the amount of interest charged?

Getting acquainted with the dataset by visually inspecting the first five rows of the dataframe and checking its shape and columns
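A minimal sketch of this first inspection step, assuming pandas is used. The small dataframe below is a hypothetical stand-in for the Prosper loan data (in the notebook itself the data would be loaded from the CSV file), and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the Prosper loan data; values are invented.
loans = pd.DataFrame({
    "Term": [36, 36, 60, 12, 36, 60],
    "LoanStatus": ["Current", "Completed", "Current",
                   "Completed", "Defaulted", "Chargedoff"],
    "BorrowerAPR": [0.12, 0.18, 0.21, 0.09, 0.30, 0.27],
    "EmploymentStatus": ["Employed", "Full-time", "Employed",
                         "Self-employed", "Other", "Employed"],
})

first_rows = loans.head()            # first five rows
n_rows, n_cols = loans.shape         # (number of rows, number of columns)
column_names = list(loans.columns)   # column labels

print(first_rows)
print(n_rows, n_cols)
print(column_names)
```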

Choosing specific variables necessary for our analysis
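One way to narrow the dataframe to the variables of interest is sketched below. The column list is only an illustration built from names mentioned in this report; `ListingKey` and all the data values are assumptions, not the actual selection used in the analysis.

```python
import pandas as pd

# Hypothetical data; "ListingKey" stands in for a column we do not need.
loans = pd.DataFrame({
    "ListingKey": ["a1", "b2", "c3"],
    "LoanStatus": ["Current", "Completed", "Defaulted"],
    "BorrowerAPR": [0.12, 0.18, 0.30],
    "Term": [36, 60, 36],
})

# Keep only the columns needed for the analysis; copy() avoids
# modifying the original dataframe later on.
columns_of_interest = ["LoanStatus", "BorrowerAPR", "Term"]
loans_sub = loans[columns_of_interest].copy()

# info() prints dtypes and non-null counts (the "meta info").
loans_sub.info()
```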

Summary of observations

From the meta info displayed above, it can be observed that

Data Cleaning

What is the structure of your dataset?

What is/are the main feature(s) of interest in your dataset?

The key features of interest in the dataset are:

What features in the dataset do you think will help support your investigation into your feature(s) of interest?

Features that helped my investigation into the features of interest include LoanStatus, MonthlyPayment and EmploymentStatus.

Univariate Exploration

Question



Visualization

Observation

Both Borrower APR and Borrower Rate appear to be roughly normally distributed, with most values concentrated between 0.1 and 0.2. However, their modal values seem to differ: the mode of Borrower APR lies between 0.1 and 0.2, whereas the mode of Borrower Rate lies between 0.3 and 0.35.
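Statements about where a distribution is densest and where its mode lies can be checked numerically as well as visually. The sketch below bins a handful of made-up APR values (not Prosper data) with `numpy.histogram` and reads off the modal bin. The bin width of 0.05 is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical APR values, not taken from the Prosper data.
apr = pd.Series([0.07, 0.12, 0.13, 0.16, 0.18, 0.19, 0.22, 0.31, 0.33],
                name="BorrowerAPR")

# Eight equal-width bins of 0.05 covering 0 to 0.4.
bin_edges = np.linspace(0, 0.4, 9)
counts, edges = np.histogram(apr.to_numpy(), bins=bin_edges)

# The modal bin is the one holding the most observations.
modal_bin = int(np.argmax(counts))
modal_left, modal_right = edges[modal_bin], edges[modal_bin + 1]

# Share of values falling between 0.1 and 0.2 (inclusive).
dense_share = apr.between(0.1, 0.2).mean()

print(counts)
print(modal_left, modal_right, dense_share)
```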

Question



Observation

Most borrowers prefer a loan term of 36 months.

Question



Observation

Question

Observation

Question

Observation

Lender Yield appears to be roughly normally distributed, with the majority of values falling between 0.1 and 0.2 and a further increase at 0.3.

Question

Observation

Most of the borrowers belong to the Other and Professional categories.

Question

The approach used for the above plot was sourced from the Plotly graphing library documentation.

Observation

CA has the largest number of borrowers, at over 13,000.

Observation

We can observe that the three states with the highest numbers of loans originated at that time were California (CA), Texas (TX), and New York (NY).
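A ranking like this can be read straight from the data with `value_counts`, which sorts categories by frequency. In the sketch below, the column name `BorrowerState` is an assumption and the state codes are invented toy data, so the counts do not reflect the real dataset.

```python
import pandas as pd

# Hypothetical state codes; counts are made up for illustration.
states = pd.Series(["CA"] * 4 + ["TX"] * 3 + ["NY"] * 2 + ["FL"],
                   name="BorrowerState")

# value_counts() sorts in descending order of frequency by default,
# so head(3) returns the three most common states.
top_states = states.value_counts().head(3)

print(top_states)
```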

Question

Observation

Question

Observation

Question

Observation

Question

Observation

Question

Observation

The income ranges 25,000-49,999 and 50,000-74,999 contain the largest groups of borrowers.
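Income ranges are ordinal, so it helps to count them in their natural order rather than alphabetically or by frequency. The sketch below does this with an ordered categorical type. The range labels and the sample values are assumptions for illustration and may not match the exact labels in the dataset.

```python
import pandas as pd

# Assumed labels, listed from lowest to highest income.
income_order = ["$1-24,999", "$25,000-49,999", "$50,000-74,999",
                "$75,000-99,999", "$100,000+"]

# Hypothetical sample of borrowers' income ranges.
income = pd.Series(
    ["$25,000-49,999", "$50,000-74,999", "$25,000-49,999", "$100,000+",
     "$1-24,999", "$50,000-74,999", "$25,000-49,999", "$75,000-99,999"],
    name="IncomeRange",
)

# Convert to an ordered categorical so sorting follows income_order.
income = income.astype(pd.CategoricalDtype(categories=income_order, ordered=True))

# Count borrowers per range, then sort by category order.
counts = income.value_counts().sort_index()

print(counts)
```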

Question

Observation

From 2009, the number of applicants increased continuously year on year until 2014, when it fell sharply.
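A yearly trend of this kind can be derived by parsing a date column and counting rows per year. The sketch below assumes a date column named `LoanOriginationDate`, which is a guess at the relevant field, and uses invented dates rather than the real data.

```python
import pandas as pd

# Hypothetical dates; the column name is an assumption.
loans = pd.DataFrame({
    "LoanOriginationDate": ["2012-03-01", "2013-05-10", "2013-07-22",
                            "2014-01-15", "2012-11-30", "2013-02-02"],
})

# Parse the strings into datetimes, then extract the calendar year.
loans["LoanOriginationDate"] = pd.to_datetime(loans["LoanOriginationDate"])
loans["Year"] = loans["LoanOriginationDate"].dt.year

# Count loans per year in chronological order.
per_year = loans["Year"].value_counts().sort_index()

print(per_year)
```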

Question

Observation

The presence of borrowers with very high debt relative to their income significantly skewed the distribution of DebtToIncomeRatio. No adjustments were made to the data to account for this, because such cases are expected in a real-world scenario. It will be interesting to see how this influences the loan interest rates.
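The effect of a few extreme ratios on a distribution can be quantified by comparing the mean with the median and by computing the sample skewness. The values below are made up to illustrate the idea and are not drawn from the Prosper data.

```python
import pandas as pd

# Hypothetical ratios with one extreme value.
ratio = pd.Series([0.10, 0.15, 0.20, 0.22, 0.25, 0.30, 10.0],
                  name="DebtToIncomeRatio")

mean_ratio = ratio.mean()      # pulled upward by the extreme value
median_ratio = ratio.median()  # robust to the extreme value
skewness = ratio.skew()        # positive for a right-skewed distribution

print(mean_ratio, median_ratio, skewness)
```

A mean well above the median together with a positive skewness indicates a long right tail, which is the pattern described above.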

Discuss the distribution(s) of your variable(s) of interest. Were there any unusual points? Did you need to perform any transformations?

Of the features you investigated, were there any unusual distributions? Did you perform any operations on the data to tidy, adjust, or change the form of the data? If so, why did you do this?


Bivariate Exploration



Question

Observation

Question

Observation

Observation

Observation

Observation




The listing categories can be verified by clicking here

Observation


Talk about some of the relationships you observed in this part of the investigation. How did the feature(s) of interest vary with other features in the dataset?


Did you observe any interesting relationships between the other features (not the main feature(s) of interest)?


Multivariate Exploration


Observation

From the heat map above, I have noted the following:

Question

How do LenderYield and ProsperRating affect BorrowerAPR?

Observation

Question

What is the relationship between Loan Status and APR with respect to each employment status?

Observation

Question

  1. Which category of defaulters is charged the highest APR?
  2. Which loan term has the highest number of defaulters?

Observation

Question

Observation

Question

What is the relationship between EmploymentStatus and BorrowerAPR with respect to LoanStatus?

Question

Analyze how the BorrowerRate changes for different loan Terms when split up by ProsperRating.

Observation

Question

How does ProsperRating affect BorrowerAPR and LoanAmount?

Observation

A better rating is associated with a larger loan amount and a lower borrower APR. It is interesting to note that as Prosper ratings rise from HR to A or higher, the correlation between borrower APR and loan amount changes from negative to marginally positive. One possible explanation is that borrowers with A or AA ratings frequently take out larger loans, so a higher APR might deter them from borrowing even more while maximizing profit. Conversely, because borrowers with worse ratings often borrow less money, a lower APR might persuade them to borrow more.
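The sign of the APR-amount relationship within each rating can be checked by computing the correlation separately per rating group. The sketch below uses the column names as written in this report (`ProsperRating`, `BorrowerAPR`, `LoanAmount`), which may differ from the exact names in the dataset, and the numbers are invented so that the two groups show opposite signs.

```python
import pandas as pd

# Hypothetical data for two rating groups; values are invented.
loans = pd.DataFrame({
    "ProsperRating": ["HR"] * 4 + ["A"] * 4,
    "BorrowerAPR": [0.30, 0.32, 0.34, 0.36, 0.10, 0.11, 0.12, 0.13],
    "LoanAmount": [4000, 3500, 3000, 2500, 10000, 10500, 10200, 11000],
})

# Pearson correlation between APR and loan amount within each rating.
corr_by_rating = pd.Series({
    rating: group["BorrowerAPR"].corr(group["LoanAmount"])
    for rating, group in loans.groupby("ProsperRating")
})

# Average loan amount per rating, for comparison across ratings.
mean_amount = loans.groupby("ProsperRating")["LoanAmount"].mean()

print(corr_by_rating)
print(mean_amount)
```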

Talk about some of the relationships you observed in this part of the investigation. Were there features that strengthened each other in terms of looking at your feature(s) of interest?

Relation between EmploymentStatus, LoanStatus and BorrowerAPR:

Were there any interesting or surprising interactions between features?